Methanotrophic carbon metabolism pathway genes and enzymes

ABSTRACT

Genes have been isolated from a Methylomonas sp. encoding enzymes in the carbon flux pathway. The genes encode a 2-keto-3-deoxy-6-phosphogluconate aldolase (KDPGA) and a fructose bisphosphate aldolase (FBPA), as well as numerous other carbon flux enzymes. The genes will be useful in C1 metabolizing microorganisms for the manipulation of the carbon flux pathway.

This application is a divisional of application Ser. No. 09/934,901, filed Aug. 22, 2001, now U.S. Pat. No. 6,555,353, which claims the benefit of U.S. Provisional Application No. 60/229,906, filed Sep. 1, 2000.

FIELD OF THE INVENTION

The invention relates to the field of molecular biology and microbiology. More specifically, the invention relates to genes involved in the conversion of hexose sugars into 3-carbon metabolites in methanotrophic bacteria.

BACKGROUND OF THE INVENTION

Methanotrophic bacteria are defined by their ability to use methane as their sole source of carbon and energy. Although methanol is an obligate intermediate in the oxidation of methane, the ability to grow on methanol alone is highly variable among the obligate methanotrophs (Green, Peter. Taxonomy of Methylotrophic Bacteria. In: Methane and Methanol Utilizers (Biotechnology Handbooks 5) J. Colin Murrell and Howard Dalton eds. 1992 Plenum Press NY. pp. 23-84). The conversion of C1 compounds to complex molecules with C-C bonds is difficult and expensive by traditional chemical synthetic routes. Traditionally, methane is first converted to synthesis gas, which is then used to produce other small molecular weight industrial precursors. The basic problem is activation of the methane molecule, a process which is thermodynamically very difficult to achieve by chemical means. Methanotrophs have proved useful mediators of this problem.

Methane monooxygenase is the enzyme required for the primary step in methane activation and the product of this reaction is methanol (Murrell et al., Arch. Microbiol. (2000), 173(5-6), 325-332). This remarkable reaction occurs at ambient temperatures and pressures, whereas chemical transformation of methane to methanol requires temperatures of hundreds of degrees and high pressures (Grigoryan, E. A., Kinet. Catal. (1999), 40(3), 350-363; WO 2000007718; U.S. Pat. No. 5,750,821). It is this ability to transform methane under ambient conditions, along with the abundance of methane, that makes the biotransformation of methane a potentially unique and valuable process.

The commercial applications of biotransformation of methane have historically fallen broadly into three categories: 1) production of single cell protein (Villadsen, John, Recent Trends Chem. React. Eng., [Proc. Int. Chem. React. Eng. Conf.], 2nd (1987), Volume 2, 320-33. Editor(s): Kulkarni, B. D.; Mashelkar, R. A.; Sharma, M. M. Publisher: Wiley East, New Delhi, India; Naguib, M., Proc. OAPEC Symp. Petroprotein, [Pap.] (1980), Meeting Date 1979, 253-77 Publisher: Organ. Arab Pet. Exporting Countries, Kuwait, Kuwait); 2) epoxidation of alkenes for production of chemicals (U.S. Pat. No. 4,348,476); and 3) biodegradation of chlorinated pollutants (Tsien et al., Gas, Oil, Coal, Environ. Biotechnol. 2, [Pap. Int. IGT Symp. Gas, Oil, Coal, Environ. Biotechnol.], 2nd (1990), 83-104, Editor(s): Akin, Cavit; Smith, Jared. Publisher: Inst. Gas Technol., Chicago, Ill.; WO 9633821; Merkley et al., Biorem. Recalcitrant Org., [Pap. Int. In Situ On-Site Bioreclam. Symp.], 3rd (1995), 165-74. Editor(s): Hinchee, Robert E; Anderson, Daniel B.; Hoeppel, Ronald E. Publisher: Battelle Press, Columbus, Ohio; Meyer et al., Microb. Releases (1993), 2(1), 11-22). Of these, epoxidation of alkenes alone has experienced little commercial success, due to low product yields, toxicity of products and the large amount of cell mass required to generate product.

Methanotrophic cells can further build the oxidation products of methane (i.e. formaldehyde) into more complex molecules such as protein, carbohydrate and lipids. For example, under certain conditions methanotrophs are known to produce exopolysaccharides (Ivanova et al., Mikrobiologiya (1988), 57(4), 600-5; Kilbane, John J., II Gas, Oil, Coal, Environ. Biotechnol. 3, [Pap. IGT's Int. Symp.], 3rd (1991), Meeting Date 1990, 207-26. Editor(s): Akin, Cavit; Smith, Jared. Publisher: IGT, Chicago, Ill.). Similarly, methanotrophs are known to accumulate both isoprenoid compounds and carotenoid pigments of various carbon lengths (Urakami et al., J. Gen. Appl. Microbiol. (1986), 32(4), 317-41). Although these compounds have been identified in methanotrophs, methanotrophs have not been the microbial platforms of choice for their production because these organisms have very poorly developed genetic systems, which limits the metabolic engineering of these organisms for chemical production.

A necessary prerequisite to metabolic engineering of methanotrophs is a full understanding, and optimization, of the carbon metabolism for maximum growth and/or product yield. In methanotrophic bacteria, methane is converted to biomolecules via a cyclic set of reactions known as the ribulose monophosphate (RuMP) cycle. The RuMP pathway comprises three phases, each phase being a series of enzymatic steps. The first phase (fixation) is the aldol condensation of three molecules of C-1 (formaldehyde) with three molecules of pentose (ribulose-5-phosphate) to form three molecules of a six-carbon sugar (fructose-6-phosphate) catalyzed by hexulose monophosphate synthase. This fixation phase is common to all methylotrophic bacteria using the RuMP pathway.

The second phase is termed “cleavage” and results in splitting of that 6-carbon sugar into two 3-carbon molecules. This may be achieved via two possible routes. Fructose-6-phosphate is either converted into fructose-1,6-bisphosphate (FBP) by phosphofructokinase and subsequently cleaved by FBP aldolase (FBPA) to 3-carbon molecules, or oxidized to 2-keto-3-deoxy-6-phosphogluconate (KDPG) and ultimately cleaved to 3-carbon sugars in a reaction catalyzed by KDPG aldolase. One of those 3-carbon molecules is recycled back through the RuMP pathway and the other 3-carbon fragment is utilized for cell growth.

In the third phase (the “rearrangement” phase), the regeneration of 3 molecules of ribulose-5-phosphate is accomplished from the two remaining molecules of fructose-6-phosphate (from stage 1) and the one molecule of the 3-carbon sugar from stage 2. There are two possible routes to achieve the rearrangement. These routes differ in that they involve either transaldolase (TA) or sedoheptulose-1,7-bisphosphatase (SBPase).
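The carbon bookkeeping of the three phases described above can be checked with a short sketch. This is only an illustrative tally, assuming the stoichiometry stated in the text (3 C1 + 3 C5 to 3 C6; 1 C6 to 2 C3; 2 C6 + 1 C3 to 3 C5); the generic name "triose" and all function names are hypothetical labels introduced here, and cofactors and phosphate groups are ignored.

```python
# Carbon atoms per molecule for the metabolites named in the text.
# "triose" is a hypothetical generic label for any 3-carbon cleavage product.
CARBONS = {
    "formaldehyde": 1,
    "ribulose-5-phosphate": 5,
    "fructose-6-phosphate": 6,
    "triose": 3,
}

# Each phase is a (consumed, produced) pair of {metabolite: molecule count}.
PHASES = {
    "fixation": ({"formaldehyde": 3, "ribulose-5-phosphate": 3},
                 {"fructose-6-phosphate": 3}),
    "cleavage": ({"fructose-6-phosphate": 1},
                 {"triose": 2}),
    "rearrangement": ({"fructose-6-phosphate": 2, "triose": 1},
                      {"ribulose-5-phosphate": 3}),
}


def total_carbon(pool):
    """Sum carbon atoms over a {metabolite: molecule count} mapping."""
    return sum(CARBONS[name] * count for name, count in pool.items())


def net_cycle():
    """Net change in each metabolite pool over one full turn of the cycle."""
    net = {}
    for consumed, produced in PHASES.values():
        for name, count in consumed.items():
            net[name] = net.get(name, 0) - count
        for name, count in produced.items():
            net[name] = net.get(name, 0) + count
    # Drop intermediates that are fully regenerated.
    return {name: delta for name, delta in net.items() if delta != 0}


if __name__ == "__main__":
    for phase, (consumed, produced) in PHASES.items():
        print(phase, total_carbon(consumed), "->", total_carbon(produced))
    print("net:", net_cycle())
```

Under these assumptions every phase balances carbon, and one turn of the cycle consumes three formaldehyde molecules and leaves one 3-carbon molecule for cell growth, which matches the description in the text.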

In methanotrophs and methylotrophs, the RuMP pathway may occur as one of three variants. These are the KDPGA/TA, FBPA/SBPase and FBPA/TA pathways. However, only two of these variants are commonly found. These two pathways are the FBPA/TA (fructose bisphosphate aldolase/transaldolase) or the KDPGA/TA (keto deoxy phosphogluconate aldolase/transaldolase) pathway, wherein only the FBPA/TA pathway is exergonic (Dijkhuizen et al. (1992) The Physiology and biochemistry of aerobic methanol-utilizing gram negative and gram positive bacteria. In: Methane and Methanol utilizers. P. 149-Colin Murrell and Howard Dalton, Plenum Press NY). Available literature suggests that obligatory methanotrophic bacteria such as Methylomonas rely solely on the KDPGA/TA pathway (Entner-Douderoff Pathway), while facultative methylotrophs utilize either the FBPA/SBPase or the FBPA/TA pathway (Dijkhuizen et al. supra). Energetically, the KDPGA/TA pathway is not as efficient as the Embden-Meyerhof pathway and thus could result in lower cellular production yields, as compared to organisms that do use the latter pathway. Therefore, a more energy efficient carbon processing pathway would greatly enhance the commercial viability of the methanotrophic platform for the generation of materials.

The problem to be solved therefore is to discover genes encoding a more energetically efficient carbon flux pathway that would enable a methanotrophic bacterial strain to better serve as a platform for the production of proteins and carbon containing materials. Applicants have solved the stated problem by providing the genes encoding the carbon flux pathway in a strain of Methylomonas. This pathway contains not only the expected elements of the Entner-Douderoff Pathway (including the 2-keto-3-deoxy-6-phosphogluconate aldolase) but additionally contains the elements of the more energy efficient Embden-Meyerhof pathway, including the fructose-1,6-bisphosphate aldolase. This discovery will permit the engineering of methanotrophs and other organisms for the energy efficient conversion of single carbon substrates such as methane and methanol to commercially useful products in the food and feed and materials industries.

SUMMARY OF THE INVENTION

The present invention provides an isolated nucleic acid molecule encoding a Methylomonas sp. carbon flux enzyme, selected from the group consisting of:

(a) an isolated nucleic acid molecule encoding the amino acid sequence selected from the group consisting of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 18, and 20;

(b) an isolated nucleic acid molecule that hybridizes with (a) under the following hybridization conditions: 0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS; and

(c) an isolated nucleic acid molecule that is complementary to (a) or (b).

Additionally the invention provides the gene products encoded by the present invention and chimera made from the instant genes by operably linking the instant genes to suitable regulatory sequences. Similarly the invention provides transformed host cells expressing the instant genes or their chimera.

The invention additionally provides a method of obtaining a nucleic acid fragment encoding a carbon flux enzyme comprising:

(a) probing a genomic library with the nucleic acid fragment of the present invention;

(b) identifying a DNA clone that hybridizes with the nucleic acid fragment of the present invention; and

(c) sequencing the genomic fragment that comprises the clone identified in step (b),

wherein the sequenced genomic fragment encodes a carbon flux enzyme.

Alternatively the invention provides a method of obtaining a nucleic acid fragment encoding a carbon flux enzyme comprising:

(a) synthesizing at least one oligonucleotide primer corresponding to a portion of the sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17, and 19; and

(b) amplifying an insert present in a cloning vector using the oligonucleotide primer of step (a);

wherein the amplified insert encodes a portion of an amino acid sequence of a carbon flux enzyme.

In another embodiment the invention provides a method of altering carbon flow through a methanotrophic bacterium comprising over-expressing at least one carbon flux gene selected from the group consisting of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, 17 and 19 in a methanotrophic strain such that the carbon flow is altered through the strain.

Additionally the invention provides a mutated gene encoding a carbon flux enzyme having an altered biological activity produced by a method comprising the steps of:

(i) digesting a mixture of nucleotide sequences with restriction endonucleases wherein said mixture comprises:

a) a native carbon flux gene;

b) a first population of nucleotide fragments which will hybridize to said native carbon flux gene;

c) a second population of nucleotide fragments which will not hybridize to said native carbon flux gene;

wherein a mixture of restriction fragments is produced;

(ii) denaturing said mixture of restriction fragments;

(iii) incubating the denatured said mixture of restriction fragments of step (ii) with a polymerase;

(iv) repeating steps (ii) and (iii) wherein a mutated carbon flux gene is produced encoding a protein having an altered biological activity.

BRIEF DESCRIPTION OF THE DRAWINGS, SEQUENCE DESCRIPTIONS AND BIOLOGICAL DEPOSITS

FIG. 1 is a schematic showing the enzyme catalyzed reactions of the Embden-Meyerhof and the Entner-Douderoff carbon pathways present in the Methylomonas 16a strain.

The invention can be more fully understood from the following detailed description and the accompanying sequence descriptions which form a part of this application.

The following sequence descriptions and sequence listings attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825. The Sequence Descriptions contain the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB standards described in Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical Journal 219 (No. 2), 345-373 (1984), which are herein incorporated by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

Description                                                SEQ ID Nucleic acid   SEQ ID Peptide
Transaldolase: Carbon Flux                                 1                     2
Transaldolase: Carbon Flux                                 3                     4
Fructose bisphosphate aldolase: Carbon Flux                5                     6
Fructose bisphosphate aldolase: Carbon Flux                7                     8
KHG/KDPG Aldolase: Carbon Flux                             9                     10
Phosphoglucomutase: Carbon Flux                            11                    12
Glucose 6 phosphate isomerase: Carbon Flux                 13                    14
Phosphofructokinase pyrophosphate dependent: Carbon Flux   15                    16
6-Phosphogluconate dehydratase: Carbon Flux                17                    18
Glucose 6 phosphate 1 dehydrogenase: Carbon Flux           19                    20

Applicants made the following biological deposits under the terms of the Budapest Treaty on the International Recognition of the Deposit of Micro-organisms for the Purposes of Patent Procedure:

Depositor Identification Reference   International Depository Designation   Date of Deposit
Methylomonas 16a                     ATCC PTA 2402                          Aug. 21, 2000

DETAILED DESCRIPTION OF THE INVENTION

The invention relates to genes encoding enzymes in the carbon flux pathway from a methanotrophic bacterium. The pathway contains genes encoding fructose-1,6-bisphosphate aldolase (FBP aldolase) and a pyrophosphate dependent phosphofructokinase, which are indicative of the Embden-Meyerhof pathway typically not found in methanotrophs. The Embden-Meyerhof pathway is energetically more favorable than the carbon flux pathway typically associated with these organisms. Additionally the invention provides genes encoding elements of the Entner-Douderoff Pathway, which is typically found in methanotrophic bacteria. These genes include those encoding a 6-phosphogluconate dehydratase, a glucose-6-phosphate-1-dehydrogenase, and a 2-keto-3-deoxy-6-phosphogluconate aldolase. Common to both pathways are new genes encoding a transaldolase and a phosphoglucomutase. Knowledge of the sequence of the present genes will be useful for altering the carbon flow in methanotrophs and other bacteria, resulting in more productive bacterial fermentation platforms for the production of chemicals and food and feed products.

In this disclosure, a number of terms and abbreviations are used. The following definitions are provided.

“Open reading frame” is abbreviated ORF.

“Polymerase chain reaction” is abbreviated PCR.

The term “a C1 carbon substrate” refers to any carbon-containing molecule that lacks a carbon-carbon bond. Examples are methane, methanol, formaldehyde, formic acid, methylated amines, and methylated thiols.

The term “RuMP” is the abbreviation for ribulose monophosphate and the “RuMP pathway” refers to the set of enzymes found in methanotrophic bacteria responsible for the conversion of the methane monooxygenase product (methanol, formaldehyde) to three carbon moieties useful for energy production in the methanotroph.

The term “Embden-Meyerhof pathway” refers to the series of biochemical reactions for conversion of hexoses such as glucose and fructose to important cellular 3-carbon intermediates such as glyceraldehyde 3 phosphate, dihydroxyacetone phosphate, phosphoenol pyruvate and pyruvate. These reactions typically proceed with net yield of biochemically useful energy in the form of ATP. The key enzymes unique to the Embden-Meyerhof pathway are the phosphofructokinase and fructose 1,6 bisphosphate aldolase.

The term “Entner-Douderoff pathway” refers to a series of biochemical reactions for conversion of hexoses such as glucose or fructose to the important 3-carbon cellular intermediates pyruvate and glyceraldehyde 3 phosphate, without any net production of biochemically useful energy. The key enzymes unique to the Entner-Douderoff pathway are the 6 phosphogluconate dehydratase and the ketodeoxyphosphogluconate aldolase.

The term “high growth methanotrophic bacterial strain” refers to a bacterium capable of growth with methane or methanol as the sole carbon and energy source and which possesses a functional Embden-Meyerhof carbon flux pathway resulting in yield of cell mass per gram of C1 substrate metabolized. The specific “high growth methanotrophic bacterial strain” described herein is referred to as “Methylomonas 16a” or “16a”, which terms are used interchangeably.

The term “methanotroph” or “methanotrophic bacteria” will refer to a prokaryotic microorganism capable of utilizing methane as its primary carbon and energy source.

As used herein, an “isolated nucleic acid fragment” is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.

The term “carbon flux gene” will refer to any gene encoding an enzyme that functions to convert C1 substrates in methanotrophic bacteria to metabolically useful products. As used herein “carbon flux genes” will be those encoding a phosphoglucomutase, a transaldolase, a glucose-6-phosphate isomerase, a phosphofructokinase (pyrophosphate dependent), a 6-phosphogluconate dehydratase, and a glucose 6 phosphate 1 dehydrogenase, as well as the distinctive fructose bisphosphate aldolase and keto deoxy phosphogluconate aldolase.

“Carbon Flux enzymes” will refer to the gene products of the carbon flux genes.

The term “transaldolase” will be abbreviated “TA” and will refer to an enzyme that catalyzes the reaction of sedoheptulose 7-phosphate and D-glyceraldehyde 3-phosphate to give D-erythrose 4-phosphate and D-fructose 6-phosphate.

The term “fructose bisphosphate aldolase” will be abbreviated “FBPA” and will refer to an enzyme that catalyzes the reaction of D-fructose 1,6-bisphosphate to give glycerone-phosphate and D-glyceraldehyde 3-phosphate.

The term “keto deoxy phosphogluconate aldolase” will be abbreviated “KDPGA” and will refer to an enzyme that catalyzes the reaction of 2-dehydro-3-deoxy-D-gluconate 6-phosphate to give pyruvate and D-glyceraldehyde 3-phosphate.

The term “phosphoglucomutase” will refer to an enzyme that catalyzes the interconversion of glucose-6-phosphate and glucose-1-phosphate.

The term “glucose-6-phosphate isomerase” will refer to an enzyme that catalyzes the conversion of fructose-6-phosphate to glucose-6-phosphate.

The term “phosphofructokinase” will refer to an enzyme that catalyzes the conversion of fructose-6-phosphate to fructose-1,6-bisphosphate.

The term “6-phosphogluconate dehydratase” will refer to an enzyme that catalyzes the conversion of 6-phosphogluconate to 2-keto-3-deoxy-6-phosphogluconate (KDPG).

The term “glucose-6-phosphate-1-dehydrogenase” will refer to an enzyme that catalyzes the conversion of glucose-6-phosphate to 6-phosphogluconate.

As used herein, “substantially similar” refers to nucleic acid fragments wherein changes in one or more nucleotide bases result in substitution of one or more amino acids, but do not affect the functional properties of the protein encoded by the DNA sequence. “Substantially similar” also refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration of gene expression by antisense or co-suppression technology. “Substantially similar” also refers to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotide bases that do not substantially affect the functional properties of the resulting transcript. It is therefore understood that the invention encompasses more than the specific exemplary sequences.

For example, it is well known in the art that alterations in a gene which result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded protein, are common. For the purposes of the present invention substitutions are defined as exchanges within one of the following five groups:

1. Small aliphatic, nonpolar or slightly polar residues: Ala, Ser, Thr (Pro, Gly);

2. Polar, negatively charged residues and their amides: Asp, Asn, Glu, Gln;

3. Polar, positively charged residues: His, Arg, Lys;

4. Large aliphatic, nonpolar residues: Met, Leu, Ile, Val (Cys); and

5. Large aromatic residues: Phe, Tyr, Trp.

Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue (such as glycine) or a more hydrophobic residue (such as valine, leucine, or isoleucine). Similarly, changes which result in substitution of one negatively charged residue for another (such as aspartic acid for glutamic acid) or one positively charged residue for another (such as lysine for arginine) can also be expected to produce a functionally equivalent product.
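As an illustration only, the five exchange groups listed above can be encoded as sets and used to test whether a given substitution falls within a single group. The sketch assumes that the parenthesized residues (Pro, Gly and Cys) count as full members of their groups, which the text does not state explicitly; the function name is hypothetical. It does not capture the broader hydrophobicity-based exchanges mentioned in the preceding paragraph (for example alanine to valine), which cross group boundaries.

```python
# The five exchange groups from the text, using three-letter residue codes.
# Assumption: parenthesized residues (Pro, Gly, Cys) are treated as members.
SUBSTITUTION_GROUPS = [
    {"Ala", "Ser", "Thr", "Pro", "Gly"},   # small aliphatic, nonpolar/slightly polar
    {"Asp", "Asn", "Glu", "Gln"},          # negatively charged and their amides
    {"His", "Arg", "Lys"},                 # positively charged
    {"Met", "Leu", "Ile", "Val", "Cys"},   # large aliphatic, nonpolar
    {"Phe", "Tyr", "Trp"},                 # large aromatic
]


def is_within_group_exchange(original, replacement):
    """Return True if both residues belong to the same exchange group."""
    return any(original in group and replacement in group
               for group in SUBSTITUTION_GROUPS)


if __name__ == "__main__":
    print(is_within_group_exchange("Asp", "Glu"))  # same group
    print(is_within_group_exchange("Ala", "Trp"))  # different groups
```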

In many cases, nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein.

Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. Moreover, the skilled artisan recognizes that substantially similar sequences encompassed by this invention are also defined by their ability to hybridize, under stringent conditions (0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS), with the sequences exemplified herein. Preferred substantially similar nucleic acid fragments of the instant invention are those nucleic acid fragments whose DNA sequences are at least 80% identical to the DNA sequence of the nucleic acid fragments reported herein. More preferred nucleic acid fragments are at least 90% identical to the DNA sequence of the nucleic acid fragments reported herein. Most preferred are nucleic acid fragments that are at least 95% identical to the DNA sequence of the nucleic acid fragments reported herein.

A nucleic acid molecule is “hybridizable” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein (entirely incorporated herein by reference). The conditions of temperature and ionic strength determine the “stringency” of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions. One set of preferred conditions uses a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45° C. for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50° C. for 30 min. A more preferred set of stringent conditions uses higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to 60° C. Another preferred set of highly stringent conditions uses two final washes in 0.1×SSC, 0.1% SDS at 65° C. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art.
The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). In one embodiment the length for a hybridizable nucleic acid is at least about 10 nucleotides. Preferably a minimum length for a hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least about 20 nucleotides; and most preferably the length is at least 30 nucleotides. Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.
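To make the dependence of Tm on salt concentration and sequence composition concrete, the following sketch uses a widely quoted empirical approximation for DNA:DNA hybrids longer than about 100 nucleotides, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 600/N, without the formamide term. This formula is supplied here as an assumption and is not quoted from the present text; the equations in the cited pages of Sambrook et al. should be consulted for the authoritative forms. The sketch also assumes that 1×SSC contributes roughly 0.195 M sodium ion (0.15 M NaCl plus 0.015 M trisodium citrate). All names are hypothetical.

```python
import math

# Assumption: 1x SSC = 0.15 M NaCl + 0.015 M trisodium citrate,
# i.e. about 0.195 M monovalent sodium ion.
NA_MOLAR_PER_1X_SSC = 0.195


def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(seq, ssc_fold):
    """Approximate Tm (deg C) of a long DNA:DNA hybrid in a given SSC dilution.

    Uses the empirical approximation named in the lead-in; it is only
    meaningful for hybrids longer than roughly 100 nucleotides and
    ignores formamide and mismatches.
    """
    sodium = NA_MOLAR_PER_1X_SSC * ssc_fold
    return (81.5
            + 16.6 * math.log10(sodium)
            + 0.41 * gc_percent(seq)
            - 600.0 / len(seq))


if __name__ == "__main__":
    probe = "ATGC" * 50  # hypothetical 200-nt sequence, 50% G+C
    for fold in (6.0, 2.0, 0.2, 0.1):
        print(f"{fold}x SSC: estimated Tm = {estimate_tm(probe, fold):.1f} C")
```

Under this approximation the estimated Tm falls as the SSC concentration is lowered, which is why washes in 0.1×SSC at 65° C. are more stringent than washes in 2×SSC at a lower temperature.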

A “substantial portion” of an amino acid or nucleotide sequence comprises enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to putatively identify that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., (1993) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a “substantial portion” of a nucleotide sequence comprises enough of the sequence to specifically identify and/or isolate a nucleic acid fragment comprising the sequence. The instant specification teaches partial or complete amino acid and nucleotide sequences encoding one or more particular microbial proteins. The skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in this art. Accordingly, the instant invention comprises the complete sequences as reported in the accompanying Sequence Listing, as well as substantial portions of those sequences as defined above.

The term “complementary” is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the instant invention also includes isolated nucleic acid fragments that are complementary to the complete sequences as reported in the accompanying Sequence Listing as well as those substantially similar nucleic acid sequences.
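The base-pairing rule described above (A with T, C with G) can be sketched as a small helper that returns the complementary strand of a DNA sequence, read 5′ to 3′. The function name and the example sequence are hypothetical and are not taken from the Sequence Listing.

```python
# Translation table implementing the DNA pairing rule: A<->T, C<->G.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    fragment = "ATGC"  # hypothetical example sequence
    print(reverse_complement(fragment))
```

Applying the function twice returns the original sequence, which is a quick sanity check that the pairing table is symmetric.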

The term “percent identity”, as known in the art, is a relationship between two or more polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the sequences. In the art, “identity” also means the degree of sequence relatedness between polypeptide or polynucleotide sequences, as the case may be, as determined by the match between strings of such sequences. “Identity” and “similarity” can be readily calculated by known methods, including but not limited to those described in: Computational Molecular Biology (Lesk, A. M., ed.) Oxford University Press, New York (1988); Biocomputing: Informatics and Genome Projects (Smith, D. W., ed.) Academic Press, New York (1993); Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., eds.) Humana Press, New Jersey (1994); Sequence Analysis in Molecular Biology (von Heinje, G., ed.) Academic Press (1987); and Sequence Analysis Primer (Gribskov, M. and Devereux, J., eds.) Stockton Press, NY (1991). Preferred methods to determine identity are designed to give the best match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. Sequence alignments and percent identity calculations may be performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5.

Suitable nucleic acid fragments (isolated polynucleotides of the present invention) encode polypeptides that are at least about 70% identical, preferably at least about 80% identical to the amino acid sequences reported herein. Preferred nucleic acid fragments encode amino acid sequences that are about 85% identical to the amino acid sequences reported herein. More preferred nucleic acid fragments encode amino acid sequences that are at least about 90% identical to the amino acid sequences reported herein. Most preferred are nucleic acid fragments that encode amino acid sequences that are at least about 95% identical to the amino acid sequences reported herein. Suitable nucleic acid fragments not only have the above homologies but typically encode a polypeptide having at least 50 amino acids, preferably at least 100 amino acids, more preferably at least 150 amino acids, still more preferably at least 200 amino acids, and most preferably at least 250 amino acids.
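The identity tiers above can be illustrated with a minimal sketch that computes percent identity over two sequences that have already been aligned (for example by the Clustal method) and maps the result to the stated thresholds. The definition used here, identical columns divided by all aligned columns other than those where both sequences have a gap, is one common convention and is an assumption; alignment programs may count gaps differently. The function names, tier labels and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumption: columns where both sequences have a gap are ignored, and a
    residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Amino acid identity thresholds stated in the text, highest first.
IDENTITY_TIERS = [
    (95.0, "most preferred"),
    (90.0, "more preferred"),
    (85.0, "preferred"),
    (70.0, "suitable"),
]


def classify_identity(pid):
    """Map a percent identity value to the tier labels used above."""
    for threshold, label in IDENTITY_TIERS:
        if pid >= threshold:
            return label
    return "below the stated thresholds"


if __name__ == "__main__":
    pid = percent_identity("MKTAYIAK", "MKTVYIAK")  # hypothetical peptides
    print(pid, classify_identity(pid))
```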

“Codon degeneracy” refers to the nature of the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment that encodes all or a substantial portion of the amino acid sequence encoding the instant microbial polypeptides as set forth in SEQ ID NOs:2, 4, 6, 8, and 10. The skilled artisan is well aware of the “codon-bias” exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
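As a rough illustration of how codon bias might be surveyed, the sketch below counts codons across a set of host coding sequences and picks the most frequently used codon among a set of synonymous codons. It is a simplified example under stated assumptions: the input sequences are in frame, the toy sequence is hypothetical, and a real design would weigh the full codon usage table rather than a single preferred codon.

```python
from collections import Counter


def codon_counts(coding_sequences):
    """Count in-frame codons across a list of host coding sequences."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    return counts


def preferred_codon(counts, synonymous_codons):
    """Return the synonymous codon used most often in the surveyed genes."""
    return max(synonymous_codons, key=lambda codon: counts[codon])


if __name__ == "__main__":
    # Hypothetical toy "genome survey": three alanine codons.
    counts = codon_counts(["GCCGCCGCA"])
    alanine_codons = ["GCT", "GCC", "GCA", "GCG"]
    print(preferred_codon(counts, alanine_codons))
```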

“Synthetic genes” can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments which are then enzymatically assembled to construct the entire gene. “Chemically synthesized”, as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well-established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the genes can be tailored for optimal gene expression based on optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.

“Gene” refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Chimeric gene” refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign” gene refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A “transgene” is a gene that has been introduced into the genome by a transformation procedure.

“Coding sequence” refers to a DNA sequence that codes for a specific amino acid sequence. “Suitable regulatory sequences” refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, polyadenylation recognition sequences, RNA processing sites, effector binding sites and stem-loop structures.

“Promoter” refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3′ to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

The “3′ non-coding sequences” refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor.

“RNA transcript” refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript; alternatively, it may be an RNA sequence derived from post-transcriptional processing of the primary transcript, in which case it is referred to as the mature RNA. “Messenger RNA (mRNA)” refers to the RNA that is without introns and that can be translated into protein by the cell. “cDNA” refers to a double-stranded DNA that is complementary to and derived from mRNA. “Sense” RNA refers to an RNA transcript that includes the mRNA and so can be translated into protein by the cell. “Antisense RNA” refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat. No. 5,107,065; WO 9928508). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5′ non-coding sequence, 3′ non-coding sequence, or the coding sequence. “Functional RNA” refers to antisense RNA, ribozyme RNA, or other RNA that is not translated yet has an effect on cellular processes.

The term “operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

The term “expression”, as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide.

“Transformation” refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” or “recombinant” or “transformed” organisms.

The terms “plasmid”, “vector” and “cassette” refer to an extrachromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3′ untranslated sequence into a cell. “Transformation cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell. “Expression cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.

The term “altered biological activity” will refer to an activity, associated with a protein encoded by a microbial nucleotide sequence, which can be measured by an assay method, where that activity is either greater than or less than the activity associated with the native microbial sequence. “Enhanced biological activity” refers to an altered activity that is greater than that associated with the native sequence. “Diminished biological activity” is an altered activity that is less than that associated with the native sequence.

The term “sequence analysis software” refers to any computer algorithm or software program that is useful for the analysis of nucleotide or amino acid sequences. “Sequence analysis software” may be commercially available or independently developed. Typical sequence analysis software includes, but is not limited to, the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.), BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol. Biol. 215:403-410 (1990)), DNASTAR (DNASTAR, Inc., 1228 S. Park St., Madison, Wis. 53715 USA), and the FASTA program incorporating the Smith-Waterman algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, N.Y.). Within the context of this application it will be understood that where sequence analysis software is used for analysis, the results of the analysis will be based on the “default values” of the program referenced, unless otherwise specified. As used herein “default values” will mean any set of values or parameters which originally load with the software when first initialized.

Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989) (hereinafter “Maniatis”); and by Silhavy, T. J., Berman, M. L. and Enquist, L. W., Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984); and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience (1987).

The invention provides genes and gene products involved in the carbon flux pathway of a Methylomonas sp. The invention also provides methods of altering carbon flux in a methanotrophic bacterium comprising the up-regulation or down-regulation of carbon flux either by introducing the present genes into a host or by suppressing the expression of sequence homologs to the present genes.

Isolation of Methylomonas 16a

The original environmental sample containing Methylomonas 16a was obtained from pond sediment. The pond sediment was inoculated directly into a defined mineral medium under 25% methane in air. Methane was used as the sole source of carbon and energy. Growth was followed until the optical density at 660 nm was stable, whereupon the culture was transferred to fresh medium such that a 1:100 dilution was achieved. After 3 successive transfers with methane as the sole carbon and energy source, the culture was plated onto defined minimal medium agar and incubated under 25% methane in air. Many methanotrophic bacterial species were isolated in this manner. However, Methylomonas 16a was selected as the organism to study due to the rapid growth of colonies, large colony size, its ability to grow on minimal media, and pink pigmentation indicative of an active biosynthetic pathway for carotenoids.
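The effect of the serial transfers described above on non-growing material carried over from the sediment can be illustrated with a short calculation. The sketch assumes exactly three transfers at 1:100 each and ignores any growth between transfers; the function name is hypothetical.

```python
from fractions import Fraction


def carryover_fraction(dilution_factor: int, transfers: int) -> Fraction:
    """Fraction of the original inoculum remaining after repeated dilution,
    assuming the material itself does not multiply between transfers."""
    return Fraction(1, dilution_factor) ** transfers


# Three successive 1:100 transfers leave one part in a million of the
# original non-growing sediment material.
remaining = carryover_fraction(100, 3)
```

Under these assumptions, organisms unable to grow on methane as the sole carbon and energy source are diluted about a million-fold before plating, whereas methane-utilizing organisms are replenished by growth at each step.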

Methanotrophs are classified into three metabolic groups (“Type I”, “Type X” or “Type II”) based on the mode of carbon incorporation, morphology, % GC content and the presence or absence of key specific enzymes. Example 4, Table 2 shows key traits determined for Methylomonas 16a in relation to the three major groupings of methanotrophs. The strain clearly falls into the Type I grouping based on every trait, with the exception of nitrogen fixation. Available literature suggests that Type I organisms do not fix nitrogen. Therefore, Methylomonas 16a appears to be unique in this aspect of nitrogen metabolism.

16S rRNA extracted from the strain was sequenced and compared to known 16S rRNAs from other microorganisms. The data showed 96% identity to sequences from Methylomonas sp. KSP III and Methylomonas sp. Strain LW13. Based on this evidence, as well as the other physiological traits described in Table 2, it was concluded that the strain was a member of the genus Methylomonas.

The present sequences have been identified by comparison of random cDNA sequences to the GenBank database using the BLAST algorithms well known to those skilled in the art. The nucleotide sequences of two genes encoding fructose bisphosphate aldolase (FFBPA) have been identified. The sequences of these genes are given in SEQ ID NO:5 and SEQ ID NO:7. The corresponding gene products are given in SEQ ID NO:6 and SEQ ID NO:8. Similarly, two genes encoding a transaldolase associated with the carbon flux pathway have been identified. These genes are set forth in SEQ ID NO:1 and SEQ ID NO:3. Their corresponding gene products are set forth in SEQ ID NO:2 and SEQ ID NO:4. Additionally, a gene encoding a keto deoxy phosphogluconate aldolase (KDPGA) has been identified and is given in SEQ ID NO:9, and the deduced amino acid sequence of the gene product is given in SEQ ID NO:10.

Genes encoding a phosphoglucomutase have also been identified where the genes and the corresponding gene products are given as SEQ ID NOs:11 and 12, respectively. Similarly, genes and gene products have been identified encoding a glucose-6-phosphate isomerase where the genes and their corresponding gene products are given as SEQ ID NOs:13 and 14, respectively. Genes encoding a phosphofructokinase have also been identified where the genes and gene products are given as SEQ ID NOs:15 and 16, respectively. A 6-phosphogluconate dehydratase encoding gene has been identified and the gene and gene product are given in SEQ ID NOs:17 and 18, respectively. Another carbon flux enzyme, 6-phosphogluconate-6-phosphate-1-dehydrogenase, has been identified and the gene and gene product are given in SEQ ID NOs:19 and 20, respectively.

Accordingly, the present invention provides a Methylomonas sp. having a gene encoding a fructose bisphosphate aldolase (FBP aldolase), a keto deoxy phosphogluconate/transaldolase (KDPG aldolase), a phosphoglucomutase, a glucose-6-phosphate isomerase, a phosphofructokinase, a 6-phosphogluconate dehydratase, and a 6-phosphogluconate-6-phosphate-1-dehydrogenase.

More specifically the present strain is recognized as having a gene encoding a transaldolase having about 78% identity at the amino acid level over a length of 328 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, N.Y.) to the sequence set forth in SEQ ID NO:2. More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred transaldolase encoding nucleic acid sequences corresponding to the instant sequences are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred transaldolase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are transaldolase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

More specifically the present strain is recognized as having a gene encoding a transaldolase having about 50% identity at the amino acid level over a length of 160 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra) to the sequence set forth in SEQ ID NO:4. More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred transaldolase encoding nucleic acid sequences corresponding to the instant sequences are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred transaldolase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are transaldolase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

Additionally the present strain is recognized as having a gene encoding an FBP aldolase having about 76% identity at the amino acid level over a length of 335 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra) to the sequence set forth in SEQ ID NO:6. More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred FBP aldolase encoding nucleic acid sequences corresponding to the instant sequences are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred FBP aldolase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are FBP aldolase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

Additionally the present strain is recognized as having a gene encoding an FBP aldolase having about 40% identity at the amino acid level over a length of 358 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra) to the sequence set forth in SEQ ID NO:8. More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred FBP aldolase encoding nucleic acid sequences corresponding to the instant sequences are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred FBP aldolase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are FBP aldolase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

Additionally the present strain is recognized as having a gene encoding a KDPG aldolase having about 59% identity at the amino acid level over a length of 212 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra) to the sequence set forth in SEQ ID NO:10. More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred KDPG aldolase encoding nucleic acid sequences corresponding to the instant sequences are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred KDPG aldolase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are KDPG aldolase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

Additionally the present strain is recognized as having a gene encoding a phosphoglucomutase having about 65% identity at the amino acid level over a length of 545 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra) to the sequence set forth in SEQ ID NO:12. More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred phosphoglucomutase encoding nucleic acid sequences corresponding to the instant sequences are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred phosphoglucomutase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are phosphoglucomutase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

Additionally the present strain is recognized as having a gene encoding a glucose-6-phosphate isomerase having about 64% identity at the amino acid level over a length of 592 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra) to the sequence set forth in SEQ ID NO:14. More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred glucose-6-phosphate isomerase encoding nucleic acid sequences corresponding to the instant sequences are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred glucose-6-phosphate isomerase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are glucose-6-phosphate isomerase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

Additionally the present strain is recognized as having a gene encoding a phosphofructokinase having about 63% identity at the amino acid level over a length of 437 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra) to the sequence set forth in SEQ ID NO:16. More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred phosphofructokinase encoding nucleic acid sequences corresponding to the instant sequences are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred phosphofructokinase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are phosphofructokinase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

Additionally the present strain is recognized as having a gene encoding a 6-phosphogluconate dehydratase having about 60% identity at the amino acid level over a length of 618 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra) to the sequence set forth in SEQ ID NO:18. More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred 6-phosphogluconate dehydratase encoding nucleic acid sequences corresponding to the instant sequences are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred 6-phosphogluconate dehydratase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are 6-phosphogluconate dehydratase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

Additionally the present strain is recognized as having a gene encoding a 6-phosphogluconate-6-phosphate-1-dehydrogenase having about 58% identity at the amino acid level over a length of 501 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra) to the sequence set forth in SEQ ID NO:20. More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred 6-phosphogluconate-6-phosphate-1-dehydrogenase encoding nucleic acid sequences corresponding to the instant sequences are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred 6-phosphogluconate-6-phosphate-1-dehydrogenase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are 6-phosphogluconate-6-phosphate-1-dehydrogenase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.
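The identity figures recited above are stated to derive from a Smith-Waterman local alignment. A minimal sketch of that dynamic-programming technique follows. It is not the FASTA implementation cited: the scoring scheme (match +2, mismatch -1, linear gap -2) and the function name are assumptions chosen for brevity, whereas the cited program scores amino acid pairs with a substitution matrix and affine gap penalties under its own default values.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment of a and b; returns (score, aligned_a, aligned_b, identity%)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # H[i][j]: best local score ending at i, j
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    aligned_a = "".join(reversed(out_a))
    aligned_b = "".join(reversed(out_b))
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y)
    identity = 100.0 * matches / len(aligned_a) if aligned_a else 0.0
    return best, aligned_a, aligned_b, identity
```

In this sketch the identity is reported over the length of the local alignment, which corresponds to the "over a length of N amino acids" phrasing used above; absolute values obtained with this simplified scoring would not be expected to reproduce the recited percentages.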

Isolation of Homologs

The nucleic acid fragments of the instant invention may be used to isolate genes encoding homologous proteins from the same or other microbial species. Isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization, and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction (PCR), Mullis et al., U.S. Pat. No. 4,683,202; ligase chain reaction (LCR), Tabor, S. et al., Proc. Natl. Acad. Sci. USA 82, 1074 (1985); or strand displacement amplification (SDA), Walker et al., Proc. Natl. Acad. Sci. U.S.A., 89, 392 (1992)).

For example, genes encoding similar proteins or polypeptides to those of the instant invention could be isolated directly by using all or a portion of the instant nucleic acid fragments as DNA hybridization probes to screen libraries from any desired bacteria using methodology well known to those skilled in the art. Specific oligonucleotide probes based upon the instant nucleic acid sequences can be designed and synthesized by methods known in the art (Maniatis). Moreover, the entire sequences can be used directly to synthesize DNA probes by methods known to the skilled artisan such as random primer DNA labeling, nick translation, or end-labeling techniques, or RNA probes using available in vitro transcription systems. In addition, specific primers can be designed and used to amplify a part or the full length of the instant sequences. The resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full length DNA fragments under conditions of appropriate stringency.

Typically, in PCR-type amplification techniques, the primers have different sequences and are not complementary to each other. Depending on the desired test conditions, the sequences of the primers should be designed to provide for both efficient and faithful replication of the target nucleic acid. Methods of PCR primer design are common and well known in the art (Thein and Wallace, “The use of oligonucleotide as specific hybridization probes in the Diagnosis of Genetic Disorders”, in Human Genetic Diseases: A Practical Approach, K. E. Davis Ed., (1986) pp. 33-50 IRL Press, Herndon, Va.; Rychlik, W. (1993) In White, B. A. (ed.), Methods in Molecular Biology, Vol. 15, pages 31-39, PCR Protocols: Current Methods and Applications. Humana Press, Inc., Totowa, N.J.).
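Two of the checks implied above, that a primer pair is not mutually complementary and that both primers anneal at comparable temperatures, may be sketched as follows. The function names are hypothetical, and the melting temperature estimate uses the common rule of thumb of 2 degrees C per A or T and 4 degrees C per G or C, which is a rough approximation for short oligonucleotides and is not recited in this specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def rule_of_thumb_tm(primer: str) -> int:
    """Approximate melting temperature (deg C): 2 per A/T plus 4 per G/C."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def longest_complementary_run(p1: str, p2: str) -> int:
    """Longest stretch of p1 able to base-pair with p2 in antiparallel fashion.

    A stretch of p1 pairs with p2 exactly when it also occurs in the reverse
    complement of p2, so this is a longest-common-substring computation.
    """
    a = p1.upper()
    b = reverse_complement(p2)
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best
```

A pair giving a long complementary run would be expected to form primer dimers and would ordinarily be redesigned; the acceptable run length is a design choice not fixed by this sketch.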

Generally two short segments of the instant sequences may be used in polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. The polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts at the 3′ end of the mRNA precursor encoding microbial genes. Alternatively, the second primer sequence may be based upon sequences derived from the cloning vector. For example, the skilled artisan can follow the RACE protocol (Frohman et al., PNAS USA 85:8998 (1988)) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3′ or 5′ end. Primers oriented in the 3′ and 5′ directions can be designed from the instant sequences. Using commercially available 3′ RACE or 5′ RACE systems (BRL), specific 3′ or 5′ cDNA fragments can be isolated (Ohara et al., PNAS USA 86:5673 (1989); Loh et al., Science 243:217 (1989)).

Alternatively the instant sequences may be employed as hybridization reagents for the identification of homologs. The basic components of a nucleic acid hybridization test include a probe, a sample suspected of containing the gene or gene fragment of interest, and a specific hybridization method. Probes of the present invention are typically single stranded nucleic acid sequences which are complementary to the nucleic acid sequences to be detected. Probes are “hybridizable” to the nucleic acid sequence to be detected. The probe length can vary from 5 bases to tens of thousands of bases, and will depend upon the specific test to be done. Typically a probe length of about 15 bases to about 30 bases is suitable. Only part of the probe molecule need be complementary to the nucleic acid sequence to be detected. In addition, the complementarity between the probe and the target sequence need not be perfect. Hybridization does occur between imperfectly complementary molecules with the result that a certain fraction of the bases in the hybridized region are not paired with the proper complementary base.

Hybridization methods are well defined. Typically the probe and sample must be mixed under conditions which will permit nucleic acid hybridization. This involves contacting the probe and sample in the presence of an inorganic or organic salt under the proper concentration and temperature conditions. The probe and sample nucleic acids must be in contact for a long enough time that any possible hybridization between the probe and sample nucleic acid may occur. The concentration of probe or target in the mixture will determine the time necessary for hybridization to occur. The higher the probe or target concentration, the shorter the hybridization incubation time needed. Optionally a chaotropic agent may be added. The chaotropic agent stabilizes nucleic acids by inhibiting nuclease activity. Furthermore, the chaotropic agent allows sensitive and stringent hybridization of short oligonucleotide probes at room temperature [Van Ness and Chen (1991) Nucl. Acids Res. 19:5143-5151]. Suitable chaotropic agents include guanidinium chloride, guanidinium thiocyanate, sodium thiocyanate, lithium tetrachloroacetate, sodium perchlorate, rubidium tetrachloroacetate, potassium iodide, and cesium trifluoroacetate, among others. Typically, the chaotropic agent will be present at a final concentration of about 3M. If desired, one can add formamide to the hybridization mixture, typically 30-50% (v/v).

Various hybridization solutions can be employed. Typically, these comprise from about 20 to 60% by volume, preferably 30%, of a polar organic solvent. A common hybridization solution employs about 30-50% v/v formamide, about 0.15 to 1M sodium chloride, about 0.05 to 0.1M buffers, such as sodium citrate, Tris-HCl, PIPES or HEPES (pH range about 6-9), about 0.05 to 0.2% detergent, such as sodium dodecylsulfate, between 0.5 and 20 mM EDTA, FICOLL (Pharmacia Inc.) (about 300-500 kilodaltons), polyvinylpyrrolidone (about 250-500 kilodaltons), and serum albumin. Also included in the typical hybridization solution will be unlabeled carrier nucleic acids from about 0.1 to 5 mg/mL, fragmented nucleic DNA, e.g., calf thymus or salmon sperm DNA, or yeast RNA, and optionally from about 0.5 to 2% wt./vol. glycine. Other additives may also be included, such as volume exclusion agents which include a variety of polar water-soluble or swellable agents, such as polyethylene glycol, anionic polymers such as polyacrylate or polymethylacrylate, and anionic saccharidic polymers, such as dextran sulfate.
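Preparing a solution within the ranges recited above reduces to the dilution relation C1 x V1 = C2 x V2 for each component. The following sketch works through one example. The stock concentrations (100% formamide, 5 M sodium chloride, 1 M Tris-HCl, 10% sodium dodecylsulfate, 500 mM EDTA), the particular final concentrations chosen within the recited ranges, and the function names are assumptions for illustration only.

```python
def stock_volume(final_conc, stock_conc, final_volume_ml):
    """Volume of stock (mL) giving final_conc in final_volume_ml (C1*V1 = C2*V2).

    final_conc and stock_conc must be expressed in the same unit.
    """
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc * final_volume_ml / stock_conc


def hybridization_recipe(final_volume_ml, components):
    """components maps name -> (final concentration, stock concentration)."""
    volumes = {
        name: stock_volume(final, stock, final_volume_ml)
        for name, (final, stock) in components.items()
    }
    used = sum(volumes.values())
    if used > final_volume_ml:
        raise ValueError("stock volumes exceed the final volume")
    volumes["water"] = final_volume_ml - used  # make up to volume with water
    return volumes


# Illustrative targets within the recited ranges; stocks are assumed values.
example = hybridization_recipe(
    10.0,
    {
        "formamide": (50.0, 100.0),  # % v/v
        "NaCl": (0.9, 5.0),          # M
        "Tris-HCl": (0.05, 1.0),     # M
        "SDS": (0.1, 10.0),          # %
        "EDTA": (1.0, 500.0),        # mM
    },
)
```

For a 10 mL final volume this example calls for 5 mL formamide, 1.8 mL sodium chloride stock, 0.5 mL buffer stock, 0.1 mL detergent stock and 0.02 mL EDTA stock, with the remainder made up with water. Carrier nucleic acids and other additives are omitted from the sketch.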

Nucleic acid hybridization is adaptable to a variety of assay formats. One of the most suitable is the sandwich assay format. The sandwich assay is particularly adaptable to hybridization under non-denaturing conditions. A primary component of a sandwich-type assay is a solid support. The solid support has adsorbed to it or covalently coupled to it immobilized nucleic acid probe that is unlabeled and complementary to one portion of the sequence.

Recombinant Expression—Microbial

The genes and gene products of the instant sequences may be produced in heterologous host cells, particularly in the cells of microbial hosts. Expression in recombinant microbial hosts may be useful for the expression of various pathway intermediates, for the modulation of pathways already existing in the host, and for the synthesis of new products heretofore not possible using the host. Additionally, the gene products may be useful for conferring higher growth yields of the host or for enabling alternative growth modes to be utilized.

Preferred heterologous host cells for expression of the instant genes and nucleic acid molecules are microbial hosts that can be found broadly within microbial families and which grow over a wide range of temperatures, pH values, and solvent tolerances. Such microbes will include generally bacteria, yeast, and filamentous fungi. Specifically, suitable yeasts and fungi will include, but are not limited to, Aspergillus, Saccharomyces, Pichia, Candida, and Hansenula. Suitable bacterial species include, but are not limited to, Salmonella, Bacillus, Acinetobacter, Rhodococcus, Streptomyces, Escherichia, and Pseudomonas. Most preferred hosts for the expression of the present carbon flux genes are members of the methanotrophic class of bacteria including Methylomonas, Methylococcus and Methylobacter. Particularly suited for transformation will be members of the genus Methylomonas. These bacterial species have the ability to convert single carbon substrates such as methane and methanol to useful products and these genes are particularly suited for substrates found in these hosts.

Of particular interest in the present invention are high growth obligate methanotrophs having an energetically favorable carbon flux pathway. For example, Applicants have discovered a specific strain of methanotroph having several pathway features which make it particularly useful for carbon flux manipulation. This type of strain has served as the host in the present application and is known as Methylomonas 16a (ATCC PTA2402).

The present strain contains several anomalies in the carbon utilization pathway. For example, based on genome sequence data, the strain is shown to contain genes for two pathways of hexose metabolism. The Entner-Doudoroff pathway, which utilizes the keto-deoxy phosphogluconate aldolase enzyme, is present in the strain. It is generally well accepted that this is the operative pathway in obligate methanotrophs. Also present, however, is the Embden-Meyerhof pathway, which utilizes the fructose bisphosphate aldolase enzyme. It is well known that this pathway is either not present or not operative in obligate methanotrophs. Energetically, the latter pathway is most favorable and allows greater yield of biologically useful energy and ultimately production of cell mass and other cell mass-dependent products in Methylomonas 16a. The activity of this pathway in the present 16a strain has been confirmed through microarray data and biochemical evidence measuring the reduction of ATP. Although the 16a strain has been shown to possess both the Embden-Meyerhof and the Entner-Doudoroff pathway enzymes, the data suggest that the Embden-Meyerhof pathway enzymes are more strongly expressed than the Entner-Doudoroff pathway enzymes. This result is surprising and counter to existing beliefs concerning the glycolytic metabolism of methanotrophic bacteria. Applicants have discovered other methanotrophic bacteria having this characteristic, including, for example, Methylomonas clara and Methylosinus sporium. It is likely that this activity has remained undiscovered in methanotrophs due to the lack of activity of the enzyme with ATP, the typical phosphoryl donor for the enzyme in most bacterial systems.

A particularly novel and useful feature of the Embden-Meyerhof pathway in strain 16a is that the key phosphofructokinase step is pyrophosphate dependent instead of ATP dependent. This feature adds to the energy yield of the pathway by using pyrophosphate instead of ATP. Because of its significance in providing an energetic advantage to the strain, this gene in the carbon flux pathway is considered diagnostic for the present strain.

Comparison of the pyrophosphate dependent phosphofructokinase gene sequence (SEQ ID NO: 15) and deduced amino acid sequence (SEQ ID NO: 16) to public databases reveals that the most similar known sequences are about 63% identical to the amino acid sequence reported herein over a length of 437 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, N.Y.). More preferred amino acid fragments are at least about 80%-90% identical to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred pyrophosphate dependent phosphofructokinase encoding nucleic acid sequences corresponding to the instant gene are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred pyrophosphate dependent phosphofructokinase nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are pyrophosphate dependent phosphofructokinase nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.
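For illustration only, percent identity over a local alignment can be computed along the following lines. This is a minimal sketch and is not the FASTA implementation cited above. The function name, the scoring values (match, mismatch and a linear gap penalty) and the short peptide strings are hypothetical and were chosen only to keep the example small.

```python
def smith_waterman_identity(a, b, match=2, mismatch=-1, gap=-2):
    """Return (percent_identity, alignment_length) for the best local alignment.

    Scoring values are illustrative assumptions, not those of any cited program.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; local alignment floors scores at 0.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero score is reached.
    i, j = best_pos
    identical = length = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in sequence b
        else:
            j -= 1  # gap in sequence a
        length += 1

    percent = 100.0 * identical / length if length else 0.0
    return percent, length


if __name__ == "__main__":
    # Hypothetical peptides differing at one position.
    print(smith_waterman_identity("MKVLAA", "MKVIAA"))
```

Identity here is reported over the aligned region only, which is how a statement such as "about 63% identical over a length of 437 amino acids" would be read.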

In methanotrophic bacteria methane is converted to biomolecules via a cyclic set of reactions known as the ribulose monophosphate pathway or RuMP cycle. This pathway is comprised of three phases, each phase being a series of enzymatic steps (FIG. 1). The first phase is “fixation” or incorporation of C-1 (formaldehyde) into a pentose to form a hexose or six-carbon sugar. This occurs via a condensation reaction between a 5-carbon sugar (pentose) and formaldehyde and is catalyzed by hexulose monophosphate synthase. The second phase is termed “cleavage” and results in splitting of that hexose into two 3-carbon molecules. One of those three-carbon molecules is recycled back through the RuMP pathway and the other 3-carbon fragment is utilized for cell growth. In methanotrophs and methylotrophs the RuMP pathway may occur as one of three variants. However, only two of these variants are commonly found: the FBP/TA (fructose bisphosphate/transaldolase) or the KDPG/TA (keto deoxy phosphogluconate/transaldolase) pathway (Dijkhuizen L., G. E. Devries. The Physiology and biochemistry of aerobic methanol-utilizing gram negative and gram positive bacteria. In: Methane and Methanol Utilizers 1992, ed Colin Murrell and Howard Dalton Plenum Press NY).

The present strain is unique in the way it handles the “cleavage” steps. First, genes were found that carry out this conversion via fructose bisphosphate as a key intermediate. The genes for fructose bisphosphate aldolase and transaldolase were found clustered together on one piece of DNA. Secondly, the genes for the other variant involving the keto deoxy phosphogluconate intermediate were also found clustered together. Available literature teaches that these organisms (obligate methylotrophs and methanotrophs) rely solely on the KDPG pathway and that the FBP-dependent fixation pathway is utilized by facultative methylotrophs (Dijkhuizen et al., supra). Therefore the latter observation is expected whereas the former is not. The finding of the FBP genes in an obligate methane utilizing bacterium is both surprising and suggestive of utility. The FBP pathway is energetically favorable to the host microorganism due to the fact that less energy (ATP) is utilized than is utilized in the KDPG pathway. Thus organisms that utilize the FBP pathway may have an energetic advantage and growth advantage over those that utilize the KDPG pathway. This advantage may also be useful for energy-requiring production pathways in the strain. By using this pathway, a methane-utilizing bacterium may have an advantage over other methane utilizing organisms as a production platform for either single cell protein or for any other product derived from the flow of carbon through the RuMP pathway.

Accordingly the present invention provides a method for altering carbon flux in a high growth, energetically favorable Methylomonas strain which

(a) grows on a C1 carbon substrate selected from the group consisting of methane and methanol; and

(b) comprises a functional Embden-Meyerhof carbon pathway, said pathway comprising a gene encoding a pyrophosphate dependent phosphofructokinase enzyme.

Microbial expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign genes are well known to those skilled in the art. Any of these could be used to construct chimeric genes for expression of any of the present genes. These chimeric genes could then be introduced into appropriate microorganisms via transformation to provide recombinant expression of the enzymes and manipulation of the carbon pathways.

Vectors or cassettes useful for the transformation of suitable host cells are well known in the art. Typically the vector or cassette contains sequences directing transcription and translation of the relevant gene, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5′ of the gene which harbors transcriptional initiation controls and a region 3′ of the DNA fragment which controls transcriptional termination. It is most preferred when both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host.

Initiation control regions or promoters which are useful to drive expression of the instant ORFs in the desired host cell are numerous and familiar to those skilled in the art. Virtually any promoter capable of driving these genes is suitable for the present invention including but not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); AOX1 (useful for expression in Pichia); and lac, ara, tet, trp, λP_L, λP_R, T7, tac, and trc (useful for expression in Escherichia coli) as well as the amy, apr, npr promoters and various phage promoters useful for expression in Bacillus.

Termination control regions may also be derived from various genes native to the preferred hosts. A termination site may be unnecessary; however, it is most preferred if included.

Pathway Engineering

The present genes may be used to affect carbon flow in bacteria and specifically methanotrophic bacteria. Commercial applications of the methanotrophs have revolved around the production of single cell protein (Villadsen, John, Recent Trends Chem. React. Eng., [Proc. Int. Chem. React. Eng. Conf.], 2nd (1987), Volume 2, 320-33. Editor(s): Kulkarni, B. D.; Mashelkar, R. A.; Sharma, M. M. Publisher: Wiley East, New Delhi, India; Naguib, M., Proc. OAPEC Symp. Petroprotein, [Pap.] (1980), Meeting Date 1979, 253-77, Publisher: Organ. Arab Pet. Exporting Countries, Kuwait, Kuwait) and the epoxidation of alkenes for production of chemicals (U.S. Pat. No. 4,348,476). These C1 substrate utilizing bacteria also are known to produce polysaccharides, used as thickeners in food and non-food industries, and isoprenoid compounds and carotenoid pigments of various carbon lengths (Urakami et al., J. Gen. Appl. Microbiol. (1986), 32(4), 317-41). The production of all of these commercially useful products will be impacted by alterations in carbon flux, in general, and by manipulation of the present genes, in particular. Such manipulation may be effected by the up- or down-regulation of various members of the carbon flux pathway.

Many of the key genes in the carbon utilization pathway are now disclosed in the present invention. Referring to FIG. 1, for example, the present invention provides genes encoding two distinct carbon flux pathways isolated from a methanotrophic bacterium. The genes and gene products are set forth in SEQ ID NO:1-SEQ ID NO:20, and encode both a KDPG aldolase and a FBP aldolase as well as a phosphoglucomutase, a pyrophosphate dependent phosphofructokinase, a 6-phosphogluconate dehydratase, and a glucose-6-phosphate 1-dehydrogenase. The phosphoglucomutase is responsible for the interconversion of glucose-6-phosphate and glucose-1-phosphate, which feeds into either the Entner-Doudoroff or Embden-Meyerhof carbon flux pathways. As shown in FIG. 1, fructose-6-phosphate may be converted either to glucose-6-phosphate by glucose phosphate isomerase (Entner-Doudoroff) or to fructose-1,6-bisphosphate (FBP) by a phosphofructokinase (Embden-Meyerhof). Following the Embden-Meyerhof pathway, FBP is then taken to two three-carbon moieties, dihydroxyacetone phosphate and 3-phosphoglyceraldehyde, by the FBP aldolase. Returning to the Entner-Doudoroff pathway, glucose-6-phosphate is taken to 6-phosphogluconate by a glucose-6-phosphate dehydrogenase, and the 6-phosphogluconate is subsequently taken to 2-keto-3-deoxy-6-phosphogluconate (KDPG) by a 6-phosphogluconate dehydratase. The KDPG is then converted to two three-carbon moieties (pyruvate and 3-phosphoglyceraldehyde) by a KDPG aldolase. Thus, the Embden-Meyerhof and Entner-Doudoroff pathways are rejoined at the level of 3-phosphoglyceraldehyde. Manipulations in any one or all of these genes may be used for commercial advantage in the production of materials from a variety of bacteria and most suitably from methanotrophic bacteria.

Methods of manipulating genetic pathways are common and well known in the art. Selected genes in a particular pathway may be upregulated or down regulated by a variety of methods. Additionally, competing pathways in the organism may be eliminated or sublimated by gene disruption and similar techniques.

Once a key genetic pathway has been identified and sequenced, specific genes may be upregulated to increase the output of the pathway. For example, additional copies of the targeted genes may be introduced into the host cell on multicopy plasmids such as pBR322. Alternatively the target genes may be modified so as to be under the control of non-native promoters. Where it is desired that a pathway operate at a particular point in a cell cycle or during a fermentation run, regulated or inducible promoters may be used to replace the native promoter of the target gene. Similarly, in some cases the native or endogenous promoter may be modified to increase gene expression. For example, endogenous promoters can be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., PCT/US93/03868).

Alternatively it may be necessary to reduce or eliminate the expression of certain genes in the target pathway or in competing pathways that may serve as competing sinks for energy or carbon. Methods of down-regulating genes for this purpose have been explored. Where the sequence of the gene to be disrupted is known, one of the most effective methods of gene down regulation is targeted gene disruption, where foreign DNA is inserted into a structural gene so as to disrupt transcription. This can be effected by the creation of genetic cassettes comprising the DNA to be inserted (often a genetic marker) flanked by sequence having a high degree of homology to a portion of the gene to be disrupted. Introduction of the cassette into the host cell results in insertion of the foreign DNA into the structural gene via the native DNA replication mechanisms of the cell. (See for example Hamilton et al. (1989) J. Bacteriol. 171:4617-4622, Balbas et al. (1993) Gene 136:211-213, Gueldener et al. (1996) Nucleic Acids Res. 24:2519-2524, and Smith et al. (1996) Methods Mol. Cell. Biol. 5:270-277.)

Antisense technology is another method of down regulating genes where the sequence of the target gene is known. To accomplish this, a nucleic acid segment from the desired gene is cloned and operably linked to a promoter such that the anti-sense strand of RNA will be transcribed. This construct is then introduced into the host cell and the antisense strand of RNA is produced. Antisense RNA inhibits gene expression by preventing the accumulation of mRNA which encodes the protein of interest. The person skilled in the art will know that special considerations are associated with the use of antisense technologies in order to reduce expression of particular genes. For example, the proper level of expression of antisense genes may require the use of different chimeric genes utilizing different regulatory elements known to the skilled artisan.

Although targeted gene disruption and antisense technology offer effective means of down regulating genes where the sequence is known, other less specific methodologies have been developed that are not sequence based. For example, cells may be exposed to UV radiation and then screened for the desired phenotype. Mutagenesis with chemical agents is also effective for generating mutants and commonly used substances include chemicals that affect nonreplicating DNA such as HNO₂ and NH₂OH, as well as agents that affect replicating DNA such as acridine dyes, notable for causing frameshift mutations. Specific methods for creating mutants using radiation or chemical agents are well documented in the art. See for example Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition (1989) Sinauer Associates, Inc., Sunderland, Mass., or Deshpande, Mukund V., Appl. Biochem. Biotechnol., 36, 227, (1992).

Another non-specific method of gene disruption is the use of transposable elements or transposons. Transposons are genetic elements that insert randomly in DNA but can be later retrieved on the basis of sequence to determine where the insertion has occurred. Both in vivo and in vitro transposition methods are known. Both methods involve the use of a transposable element in combination with a transposase enzyme. When the transposable element or transposon is contacted with a nucleic acid fragment in the presence of the transposase, the transposable element will randomly insert into the nucleic acid fragment. The technique is useful for random mutagenesis and for gene isolation, since the disrupted gene may be identified on the basis of the sequence of the transposable element. Kits for in vitro transposition are commercially available (see for example The Primer Island Transposition Kit, available from Perkin Elmer Applied Biosystems, Branchburg, N.J., based upon the yeast Ty1 element; The Genome Priming System, available from New England Biolabs, Beverly, Mass., based upon the bacterial transposon Tn7; and the EZ::TN Transposon Insertion Systems, available from Epicentre Technologies, Madison, Wis., based upon the Tn5 bacterial transposable element).

Within the context of the present invention it may be useful to modulate the expression of the carbon flux pathway. It is apparent from the known pathways in methanotrophic bacteria that there can be utility in either the FBP/TA or KDPG/TA pathway, depending on the target product. The FBP/TA pathway is more energy-yielding and thus is advantageous from the standpoint of producing more cellular mass per unit of methane metabolized. Thus if the strain is forced to utilize this pathway via a gene knock-out of the KDPG/TA pathway, it is anticipated that greater cell mass will be produced. In addition, the production of chemicals that have a high energy requirement for biosynthesis in the form of ATP may also be enhanced by deletion or mutation of the KDPG/TA pathway. Chemical production requiring pyruvate as a key intermediate, however, might benefit from the deletion or knock-out of the FBP/TA pathway genes. As an integral part of the Methylomonas production platform it is desirable to have the capability to utilize either pathway via introduction of specialized regulatory gene promoters that will enable either pathway to be switched on or off in the presence of chemicals that could be added to the fermentation.

More specifically, it has been noted that the present Methylomonas 16a comprises genes encoding both the Entner-Doudoroff and Embden-Meyerhof carbon flux pathways. Because the Embden-Meyerhof pathway is more energy efficient it may be desirable to over-express the genes in this pathway. Additionally, it is likely that the Entner-Doudoroff pathway is a competitive pathway and inhibition of this pathway may lead to increased energy efficiency in the Embden-Meyerhof system. This might be accomplished by selectively using the above described methods of gene down regulation on the sequence encoding the keto-deoxy phosphogluconate aldolase (SEQ ID NO: 9) or any of the other members of the Entner-Doudoroff system and upregulating the gene encoding the fructose bisphosphate aldolase of the Embden-Meyerhof system (SEQ ID NO: 5 or 7). In this fashion, the carbon flux in the present Methylomonas 16a may be optimized. Additionally, where the present strain has been engineered to produce specific organic materials such as aromatics for monomer production, optimization of the carbon flux pathway will lead to increased yields of these materials.

Industrial Scale Production

Where the engineering of a commercial bacterial production platform comprising the present genes is desired, a variety of culture methodologies may be applied. For example, large scale production of a specific product or products from a recombinant microbial host may be achieved by either batch or continuous culture methodologies.

A classical batch culturing method is a closed system where the composition of the media is set at the beginning of the culture and not subject to artificial alterations during the culturing process. Thus, at the beginning of the culturing process the media is inoculated with the desired organism or organisms and growth or metabolic activity is permitted to occur adding nothing to the system. Typically, however, a “batch” culture is batch with respect to the addition of carbon source and attempts are often made at controlling factors such as pH and oxygen concentration. In batch systems the metabolite and biomass compositions of the system change constantly up to the time the culture is terminated. Within batch cultures cells moderate through a static lag phase to a high growth log phase and finally to a stationary phase where growth rate is diminished or halted. If untreated, cells in the stationary phase will eventually die. Cells in log phase are often responsible for the bulk of production of end product or intermediate in some systems. Stationary or post-exponential phase production can be obtained in other systems.

A variation on the standard batch system is the Fed-Batch system. Fed-Batch culture processes are also suitable in the present invention and comprise a typical batch system with the exception that the substrate is added in increments as the culture progresses. Fed-Batch systems are useful when catabolite repression is apt to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the media. Measurement of the actual substrate concentration in Fed-Batch systems is difficult and is therefore estimated on the basis of the changes of measurable factors such as pH, dissolved oxygen and the partial pressure of waste gases such as CO₂. Batch and Fed-Batch culturing methods are common and well known in the art and examples may be found in Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition (1989) Sinauer Associates, Inc., Sunderland, Mass., or Deshpande, Mukund V., Appl. Biochem. Biotechnol., 36, 227, (1992), herein incorporated by reference.

Commercial use of the instant gene pathways may also be accomplished with a continuous culture. Continuous cultures are an open system where a defined culture media is added continuously to a bioreactor and an equal amount of conditioned media is removed simultaneously for processing. Continuous cultures generally maintain the cells at a constant high liquid phase density where cells are primarily in log phase growth. Alternatively continuous culture may be practiced with immobilized cells where carbon and nutrients are continuously added and valuable products, by-products or waste products continuously removed from the cell mass. Cell immobilization may be performed using a wide range of solid supports composed of natural and/or synthetic materials.

Continuous or semi-continuous culture allows for the modulation of one factor or any number of factors that affect cell growth or end product concentration. For example, one method will maintain a limiting nutrient such as the carbon source or nitrogen level at a fixed rate and allow all other parameters to moderate. In other systems a number of factors affecting growth can be altered continuously while the cell concentration, measured by media turbidity, is kept constant. Continuous systems strive to maintain steady state growth conditions and thus the cell loss due to media being drawn off must be balanced against the cell growth rate in the culture. Methods of modulating nutrients and growth factors for continuous culture processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology and a variety of methods are detailed by Brock, supra.
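The balance between cell loss and cell growth described above may be illustrated numerically. The following is a minimal sketch that assumes a well-mixed vessel and a constant specific growth rate, so that cell density X follows dX/dt = (mu - D)X, where D is the dilution rate (medium flow divided by culture volume). The function names and all numerical values are hypothetical and are not taken from the present disclosure.

```python
import math


def dilution_rate(flow_l_per_h, volume_l):
    """Dilution rate D (per hour): medium flow divided by culture volume."""
    return flow_l_per_h / volume_l


def cell_density(x0, mu_per_h, d_per_h, hours):
    """Cell density after `hours`, from dX/dt = (mu - D) * X.

    Assumes constant specific growth rate mu and dilution rate D,
    giving X(t) = X0 * exp((mu - D) * t).
    """
    return x0 * math.exp((mu_per_h - d_per_h) * hours)


if __name__ == "__main__":
    d = dilution_rate(flow_l_per_h=2.0, volume_l=10.0)  # hypothetical reactor
    for mu in (0.1, 0.2, 0.3):  # hypothetical specific growth rates, per hour
        print(mu, cell_density(1.0, mu, d, hours=24))
```

Under these assumptions the density holds steady only when the growth rate equals the dilution rate; a lower growth rate leads to washout of the culture and a higher one to accumulation of cells.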

Protein Engineering

It is contemplated that the present nucleotide sequences may be used to produce gene products having enhanced or altered activity. Various methods are known for mutating a native gene sequence to produce a gene product with altered or enhanced activity including but not limited to error prone PCR (Melnikov et al., Nucleic Acids Research, (Feb. 15, 1999) Vol. 27, No. 4, pp. 1056-1062); site directed mutagenesis (Coombs et al., Proteins (1998), 259-311, 1 plate. Editor(s): Angeletti, Ruth Hogue. Publisher: Academic, San Diego, Calif.) and “gene shuffling” (U.S. Pat. Nos. 5,605,793; 5,811,238; 5,830,721; and 5,837,458, incorporated herein by reference).

The method of gene shuffling is particularly attractive due to its facile implementation, high rate of mutagenesis and ease of screening. The process of gene shuffling involves the restriction endonuclease cleavage of a gene of interest into fragments of specific size in the presence of additional populations of DNA regions having both similarity to and difference from the gene of interest. This pool of fragments will then be denatured and reannealed to create a mutated gene. The mutated gene is then screened for altered activity.

The carbon flux sequences of the present invention may be mutated and screened for altered or enhanced activity by this method. The sequences should be double stranded and can be of various lengths ranging from 50 bp to 10 kb. The sequences may be randomly digested into fragments ranging from about 10 bp to 1000 bp, using restriction endonucleases well known in the art (Maniatis supra). In addition to the instant microbial sequences, populations of fragments that are hybridizable to all or portions of the microbial sequence may be added. Similarly, a population of fragments which are not hybridizable to the instant sequence may also be added. Typically these additional fragment populations are added in about a 10 to 20 fold excess by weight as compared to the total nucleic acid. Generally if this process is followed the number of different specific nucleic acid fragments in the mixture will be about 100 to about 1000. The mixed population of random nucleic acid fragments are denatured to form single-stranded nucleic acid fragments and then reannealed. Only those single-stranded nucleic acid fragments having regions of homology with other single-stranded nucleic acid fragments will reanneal. The random nucleic acid fragments may be denatured by heating. One skilled in the art could determine the conditions necessary to completely denature the double stranded nucleic acid. Preferably the temperature is from 80° C. to 100° C. The nucleic acid fragments may be reannealed by cooling. Preferably the temperature is from 20° C. to 75° C. Renaturation can be accelerated by the addition of polyethylene glycol (“PEG”) or salt. A suitable salt concentration may range from 0 mM to 200 mM. The annealed nucleic acid fragments are next incubated in the presence of a nucleic acid polymerase and dNTPs (i.e. dATP, dCTP, dGTP and dTTP). The nucleic acid polymerase may be the Klenow fragment, the Taq polymerase or any other DNA polymerase known in the art. 
The polymerase may be added to the random nucleic acid fragments prior to annealing, simultaneously with annealing or after annealing. The cycle of denaturation, renaturation and incubation in the presence of polymerase is repeated for a desired number of times. Preferably the cycle is repeated from 2 to 50 times, more preferably the sequence is repeated from 10 to 40 times. The resulting nucleic acid is a larger double-stranded polynucleotide of from about 50 bp to about 100 kb and may be screened for expression and altered activity by standard cloning and expression protocol (Maniatis supra).

Gene Expression Profiling

The present carbon flux genes may be used in connection with gene expression profiling technology for metabolic characterization of the cell from which the genes came. For example, many external changes such as changes in growth condition or exposure to chemicals can cause induction or repression of genes in the cell. The induction or repression of genes can be used in screening systems, for example to determine the best growth conditions for a production organism or in drug discovery to identify compounds with a similar mode of action, to mention just a few applications. On the other hand, by amplifying or disrupting genes, one can manipulate the amount of cellular products produced as well as the timeline upon which those products are produced. All or a portion of the nucleic acid fragments of the instant invention may be used as probes for gene expression monitoring and gene expression profiling.

For example, all or a portion of the instant nucleic acid fragments may be immobilized on a nylon membrane or a glass slide. A Generation II DNA spotter (Molecular Dynamics) is one of the available technologies to array the DNA samples onto the coated glass slides. Other array methods are also available and well known in the art. After the cells are grown in various growth conditions or treated with potential candidates, cellular RNA is purified. Fluorescent or radioactive labeled target cDNA can be made by reverse transcription of mRNA. The target mixture is hybridized to the probes and washed using conditions well known in the art. The amount of the target gene expression is quantified by the intensity of radioactivity or fluorescence labels (e.g., confocal laser microscope: Molecular Dynamics). The intensities of radioactivity or fluorescent label at the immobilized probes are measured using technology well known in the art. The two color fluorescence detection scheme (e.g., Cy3 and Cy5) has the advantage over radioactively labeled targets of allowing rapid and simultaneous differential expression analysis of independent samples. In addition, the use of ratio measurements compensates for probe to probe variation of intensity due to DNA concentration and hybridization efficiency. In the case of fluorescence labeling, the two fluorescent images obtained with the appropriate excitation and emission filters constitute the raw data from which differential gene expression ratio values are calculated. The intensities of the images are analyzed using available software well known in the art (e.g., Array Vision 4.0: Imaging Research Inc.) and normalized to compensate for the differential efficiencies of labeling and detection of the label. There are many different ways known in the art to normalize the signals. One of the ways to normalize the signal is by correcting the signal against internal controls. 
Another way is to run a separate array with labeled genomic driven DNA and compare the signal with mRNA driven signals. This method also allows measurement of the transcript abundance. The array data of individual genes is examined and evaluated to determine the induction or repression of each gene under the test conditions.
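By way of illustration, ratio values from a two color experiment might be normalized against internal controls as sketched below. This is a minimal sketch under stated assumptions and is not the procedure of any particular software package: it assumes background-corrected, strictly positive intensities, takes the log2 of the Cy5/Cy3 ratio for each probe, and subtracts the median log ratio of the control probes. The function name, probe names and intensity values are hypothetical.

```python
import math
from statistics import median


def normalized_log_ratios(cy5, cy3, control_ids):
    """Return control-normalized log2(Cy5/Cy3) ratios per probe.

    cy5, cy3: dicts mapping probe id -> background-corrected intensity (> 0).
    control_ids: probe ids of internal controls expected to be unchanged.
    """
    # Raw log ratio for every probe on the array.
    raw = {probe: math.log2(cy5[probe] / cy3[probe]) for probe in cy5}
    # Labeling/detection bias estimated as the median control log ratio.
    offset = median(raw[probe] for probe in control_ids)
    return {probe: value - offset for probe, value in raw.items()}


if __name__ == "__main__":
    # Hypothetical intensities: controls show a two-fold channel bias.
    cy5 = {"ctrl1": 200.0, "ctrl2": 400.0, "geneA": 800.0}
    cy3 = {"ctrl1": 100.0, "ctrl2": 200.0, "geneA": 200.0}
    print(normalized_log_ratios(cy5, cy3, ["ctrl1", "ctrl2"]))
```

After normalization a value near zero indicates no change between samples, while positive and negative values indicate induction and repression respectively. The median is used here only because it is robust to an occasional aberrant control spot.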

EXAMPLES

The present invention is further defined in the following Examples. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions.

General Methods

Standard recombinant DNA and molecular cloning techniques used in the Examples are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, (1989) (Maniatis) and by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984) and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, pub. by Greene Publishing Assoc. and Wiley-Interscience (1987).

Materials and methods suitable for the maintenance and growth of bacterial cultures are well known in the art. Techniques suitable for use in the following examples may be found as set out in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, eds), American Society for Microbiology, Washington, D.C. (1994) or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland, Mass. (1989). All reagents, restriction enzymes and materials used for the growth and maintenance of bacterial cells were obtained from Aldrich Chemicals (Milwaukee, Wis.), DIFCO Laboratories (Detroit, Mich.), GIBCO/BRL (Gaithersburg, Md.), or Sigma Chemical Company (St. Louis, Mo.) unless otherwise specified.

Manipulations of genetic sequences were accomplished using the suite of programs available from the Genetics Computer Group Inc. (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.). Where the GCG program “Pileup” was used, the gap creation default value of 12 and the gap extension default value of 4 were used. Where the GCG “Gap” or “Bestfit” programs were used, the default gap creation penalty of 50 and the default gap extension penalty of 3 were used. In any cases where GCG program parameters were not prompted for, in these or any other GCG program, default values were used.
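For illustration, the effect of the gap creation and gap extension values recited above can be sketched under the assumption that a gap is charged the creation penalty once plus the extension penalty for each gapped position. The exact formula applied by a given GCG program version is not stated here, and the function and dictionary names are hypothetical.

```python
# Default values recited in the text for the GCG programs.
PILEUP_DEFAULTS = {"creation": 12, "extension": 4}
GAP_BESTFIT_DEFAULTS = {"creation": 50, "extension": 3}


def affine_gap_penalty(gap_length, creation, extension):
    """Penalty for one gap, assuming creation + extension * gap_length."""
    if gap_length <= 0:
        return 0
    return creation + extension * gap_length


if __name__ == "__main__":
    for name, params in (("Pileup", PILEUP_DEFAULTS), ("Gap/Bestfit", GAP_BESTFIT_DEFAULTS)):
        print(name, [affine_gap_penalty(n, **params) for n in (1, 5, 10)])
```

Under this assumption, a high creation penalty with a low extension penalty favors a few long gaps over many short ones.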

Multiple alignment of the sequences was performed using the FASTA program incorporating the Smith-Waterman algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, N.Y.).
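The Smith-Waterman local alignment recurrence referred to above can be sketched as follows. This is a minimal illustration, not the FASTA implementation: it assumes a simple match/mismatch score and a linear gap penalty, whereas production programs use substitution matrices and affine gap penalties. All scoring values and names are hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    H[i][j] holds the best score of any local alignment ending at a[i-1], b[j-1];
    a cell is reset to zero whenever extending the alignment would score negative.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

The zero floor in the recurrence is what makes the alignment local rather than global.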

The meaning of abbreviations is as follows: “h” means hour(s), “min” means minute(s), “sec” means second(s), “d” means day(s), “mL” means milliliters, “L” means liters.

Example 1 Isolation Of Methylomonas 16a

The original environmental sample containing the isolate was obtained from pond sediment. The pond sediment was inoculated directly into a defined mineral medium under 25% methane in air. Methane was the sole source of carbon and energy. Growth was followed until the optical density at 660 nm was stable, whereupon the culture was transferred to fresh medium such that a 1:100 dilution was achieved. After 3 successive transfers with methane as sole carbon and energy source, the culture was plated onto defined minimal medium agar and incubated under 25% methane in air. Many methanotrophic bacterial species were isolated in this manner. However, Methylomonas 16a was selected as the organism to study due to the rapid growth of colonies, large colony size, ability to grow on minimal media, and pink pigmentation indicative of an active biosynthetic pathway for carotenoids.
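As a worked illustration of the enrichment arithmetic, three successive 1:100 transfers dilute the original inoculum about 100 × 100 × 100 = 1,000,000-fold, ignoring any growth between transfers. The helper name below is hypothetical.

```python
def cumulative_dilution(fold_per_transfer, transfers):
    """Overall dilution factor after repeated transfers at the same fold dilution."""
    return fold_per_transfer ** transfers


if __name__ == "__main__":
    # Three successive 1:100 transfers, as described in Example 1.
    print(cumulative_dilution(100, 3))
```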

Example 2 Preparation of Genomic DNA for Sequencing and Sequence Generation

Genomic DNA was isolated from Methylomonas 16a according to standard protocols.

Genomic DNA preparation and library construction were carried out according to published protocols (Fraser et al., The Minimal Gene Complement of Mycoplasma genitalium; Science 270, 1995). A cell pellet was resuspended in a solution containing 100 mM Na-EDTA pH 8.0, 10 mM tris-HCl pH 8.0, 400 mM NaCl, and 50 mM MgCl2.

Genomic DNA preparation

After resuspension, the cells were gently lysed in 10% SDS, and incubated for 30 minutes at 55° C. After incubation at room temperature, proteinase K was added to 100 μg/ml and incubated at 37° C. until the suspension was clear. DNA was extracted twice with tris-equilibrated phenol and twice with chloroform. DNA was precipitated in 70% ethanol and resuspended in a solution containing 10 mM tris-HCl and 1 mM Na-EDTA (TE) pH 7.5. The DNA solution was treated with a mix of RNAases, then extracted twice with tris-equilibrated phenol and twice with chloroform. This was followed by precipitation in ethanol and resuspension in TE.

Library construction

200 to 500 μg of chromosomal DNA was resuspended in a solution of 300 mM sodium acetate, 10 mM tris-HCl, 1 mM Na-EDTA, and 30% glycerol, and sheared at 12 psi for 60 sec in an Aeromist Downdraft Nebulizer chamber (IBI Medical products, Chicago, Ill.). The DNA was precipitated, resuspended and treated with Bal31 nuclease. After size fractionation, a fraction (2.0 kb, or 5.0 kb) was excised and cleaned, and a two-step ligation procedure was used to produce a high titer library with greater than 99% single inserts.

Sequencing

A shotgun sequencing strategy was adopted for the sequencing of the whole microbial genome (Fleischmann, Robert et al., Whole-Genome Random Sequencing and Assembly of Haemophilus influenzae Rd; Science 269, 1995).

Sequence was generated on an ABI Automatic sequencer using dye terminator technology (U.S. Pat. No. 5,366,860; EP 272007) using a combination of vector and insert-specific primers. Sequence editing was performed in either DNAStar (DNA Star Inc.) or the Wisconsin GCG program (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.) and the CONSED package (version 7.0). All sequences represent coverage at least two times in both directions.

Example 3 Identification and Characterization of Bacteria ORF's

The carbon flux genes isolated from Methylomonas 16a were identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., (1990) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences contained in the BLAST “nr” database (comprising all non-redundant GenBank CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data Bank, the SWISS-PROT protein sequence database, EMBL, and DDBJ databases). The sequences obtained in Example 2 were analyzed for similarity to all publicly available DNA sequences contained in the “nr” database using the BLASTN algorithm provided by the National Center for Biotechnology Information (NCBI). The DNA sequences were translated in all reading frames and compared for similarity to all publicly available protein sequences contained in the “nr” database using the BLASTX algorithm (Gish, W. and States, D. J. (1993) Nature Genetics 3:266-272) provided by the NCBI. All comparisons were done using either the BLASTNnr or BLASTXnr algorithm. The results of the BLAST comparisons are given in Table 1, which summarizes the sequences to which each gene has the most similarity. Table 1 displays data based on the BLASTXnr algorithm with values reported in expect values. The Expect value estimates the statistical significance of the match, specifying the number of matches, with a given score, that are expected in a search of a database of this size absolutely by chance.
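The translation of a DNA sequence in all reading frames, which precedes the protein-level comparison described above, can be sketched as follows. This is an illustrative sketch and not the BLASTX implementation. It assumes the standard codon assignments, marks stop codons with "*" and unrecognized codons with "X", and uses hypothetical function names.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; trailing bases are ignored."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translations(dna):
    """Return translations for frames +1..+3 and -1..-3, keyed by frame label."""
    dna = dna.upper()
    frames = {}
    for strand, template in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


if __name__ == "__main__":
    for frame, peptide in six_frame_translations("ATGGCCTAA").items():
        print(frame, peptide)
```

Each of the six conceptual translations would then be compared against the protein database.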

TABLE 1
Genes Characterized From Methylomonas 16a

Gene Name             Similarity Identified             SEQ ID  SEQ ID   %             %               E-value (c)  Citation
                                                                Peptide  Identity (a)  Similarity (b)

Transaldolase         Transaldolase GI:1729831          1       2        78%           90%             2.7e-92      Kohler, U., et al., Plant Mol. Biol.
                      Anabaena variabilis                                                                           30 (1), 213-218 (1996)

MIPB-transaldolase    Transaldolase GI:7443254          3       4        50%           79%             1e-23        Blattner F. R. et al.,
                      Escherichia coli                                                                              Science 277:1453-1474 (1997)

FBA or FDA            Fructose bisphosphate aldolase    5       6        76%           92%             4.1e-111     Alefounder P. R. et al.,
                                                                                                                    Mol. Microbiol. 3:723-732 (1989)

FBA or FDA            Fructose bisphosphate aldolase    7       8        40%           70%             2.3e-39      van den Bergh E. R. et al.,
                                                                                                                    J. Bacteriol. 178:888-893 (1996)

KHG/KDPG Aldolase     (AL352972) KHG/KDPG aldolase      9       10       59%           72%             1e-64        Redenbach et al., Mol. Microbiol.
                      Streptomyces coelicolor                                                                       21 (1), 77-96 (1996)

Phosphoglucomutase    Phosphoglucomutase (Glucose       11      12       65%           85%             1.7e-140     Lepek et al., Direct Submission
                      Phosphomutase) (Pgm)                                                                          gb|AAD03475.1|
                      gi|3241933|gb|AAD03475.1|

Glucose 6 phosphate   Glucose 6 phosphate isomerase     13      14       64%           81%             1.6e-136     Blattner et al., Nucleic Acids Res.
isomerase             gi|396360|gb|AAC43119.1                                                                       21 (23), 5408-5417 (1993)

Phosphofructokinase   Phosphofructokinase pyrophosphate 15      16       63%           83%             1.7e-97      Ladror et al., J. Biol. Chem. 266,
pyrophosphate         dependent                                                                                     16550-16555 (1991)
dependent             gi|150931|gb|AAA25675.1| (M67447)

6-Phosphogluconate    6-Phosphogluconate dehydratase    17      18       60%           85%             1.6e-141     Willis et al., J. Bacteriol.
dehydratase           gi|4210902|gb|AAD12045.1|                                                                     181 (14), 4176-4184 (1999)
                      (AF045609)

Glucose 6 phosphate 1 Glucose 6 phosphate 1             19      20       58%           85%             9.4e-123     Hugouvieux-Cotte-Pattat, N.,
dehydrogenase         dehydrogenase                                                                                 Direct Submission,
                      gi|397854|emb|CAA52858.1| (X74866)                                                            gi|397854|emb|CAA52858.1| (X74866)

(a) % Identity is defined as the percentage of amino acids that are identical between the two proteins.
(b) % Similarity is defined as the percentage of amino acids that are identical or conserved between the two proteins.
(c) Expect value. The Expect value estimates the statistical significance of the match, specifying the number of matches, with a given score, that are expected in a search of a database of this size absolutely by chance.
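The % Identity and % Similarity definitions given with Table 1 can be illustrated with a minimal sketch operating on two already-aligned peptide strings. Several choices below are assumptions made for illustration and are not taken from the specification: the grouping of conserved residues, the use of the full alignment length as the denominator, and all names. BLAST itself counts a pair as similar when its substitution-matrix score is positive.

```python
# Hypothetical groups of residues treated as conservative substitutions.
CONSERVED_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def percent_identity_similarity(aln_a, aln_b):
    """Return (% identity, % similarity) for two aligned sequences of equal length.

    Gaps are written as '-'; gapped columns count toward the alignment length
    but are never identical or conserved.
    """
    if len(aln_a) != len(aln_b) or not aln_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    identical = conserved = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if "-" in (x, y):
            continue
        if x == y:
            identical += 1
        elif any(x in group and y in group for group in CONSERVED_GROUPS):
            conserved += 1
    length = len(aln_a)
    return 100 * identical / length, 100 * (identical + conserved) / length


if __name__ == "__main__":
    print(percent_identity_similarity("MKVLD", "MRVLE"))
```

Because identical residues are also counted as similar, % Similarity is never lower than % Identity, as in every row of Table 1.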

1. An isolated nucleic acid molecule encoding a fructose biphosphate aldolase enzyme, selected from the group consisting of: (a) an isolated nucleic acid molecule encoding the amino acid sequence as set forth in SEQ ID NO: 8, and (b) an isolated nucleic acid molecule that hybridizes with (a) under the following hybridization conditions: 0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS; or an isolated nucleic acid molecule that is complementary to (a) or (b).
2. The isolated nucleic acid molecule of claim 1 as set forth in SEQ ID NO:7.
3. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a fructose biphosphate aldolase polypeptide that has at least 95% identity when compared to a polypeptide having the sequence as set forth in SEQ ID NO:8, or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
4. A chimeric gene comprising the isolated nucleic acid fragment of claim 1 operably linked to suitable regulatory sequences.
5. A transformed host cell comprising a host cell and the chimeric gene of claim 4.
6. The transformed host cell of claim 5 wherein the host cell is selected from the group consisting of bacteria, yeast, and filamentous fungi.
7. The transformed host cell of claim 6 wherein the host cell is selected from the group consisting of Aspergillus, Saccharomyces, Pichia, Candida, Hansenula, Salmonella, Bacillus, Acinetobacter, Rhodococcus, Streptomyces, Escherichia, Pseudomonas, Methylomonas, Methylococcus and Methylobacter.